Noise Compensation for Speech Recognition Using Subspace Gaussian Mixture Models
نویسندگان
چکیده
In this paper, we adress the problem of additive noise which degrades substantially the performances of speech recognition system. We propose a cepstral denoising based on the Subspace Gaussian Mixture Models paradigm (SGMM). The acoustic space is modeled by using a UBM-GMM. Each phoneme is modeled by a GMM derived from the UBM. The concatenation of the means of a given GMM leads to a very high dimention space, called the supervector space. The SGMM paradigm allows to model the additive noise as an additive component located in a subspace of low dimension (with respect to the supervector space). For each speech segment, this additive noise component is estimated in a model space. From this estimation, a specific frame transformation is obtained and applied to such a data frame. In this work, training data are assumed to be clean, so the cleaning process is applied only on test data. The proposed approach is tested on data recorded in a noisy environment and also on artificially noised data. With this approach we obtain, on data recorded in a noisy environment, a relative WER reduction of 15%.
منابع مشابه
Noise Compensation for Subspace Gaussian Mixture Models
Joint uncertainty decoding (JUD) is an effective model-based noise compensation technique for conventional Gaussian mixture model (GMM) based speech recognition systems. In this paper, we apply JUD to subspace Gaussian mixture model (SGMM) based acoustic models. The total number of Gaussians in the SGMM acoustic model is usually much larger than for conventional GMMs, which limits the applicati...
متن کاملSpeech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering
Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...
متن کاملNoise adaptive training for subspace Gaussian mixture models
Noise adaptive training (NAT) is an effective approach to normalise environmental distortions when training a speech recogniser on noise-corrupted speech. This paper investigates the model-based NAT scheme using joint uncertainty decoding (JUD) for subspace Gaussian mixture models (SGMMs). A typical SGMM acoustic model has much larger number of surface Gaussian components, which makes it comput...
متن کاملJoint uncertainty decoding with unscented transform for noise robust subspace Gaussian mixture models
Common noise compensation techniques use vector Taylor series (VTS) to approximate the mismatch function. Recent work shows that the approximation accuracy may be improved by sampling. One such sampling technique is the unscented transform (UT), which draws samples deterministically from clean speech and noise model to derive the noise corrupted speech parameters. This paper applies UT to noise...
متن کاملSpeech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty
In this paper an estimator for speech enhancement based on Laplacian Mixture Model has been proposed. The proposed method, estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, when the clean speech DFT coefficients are supposed mixture of Laplacians and the DFT coefficients of noise are assumed zero-mean Gaussian distribution. Furthermore, the MMS...
متن کامل